Abstract

We present a robust approach for linking already existing
lexical/semantic hierarchies. We used a constraint satisfaction
algorithm (relaxation labeling) to select, among a set of candidates,
the node in a target taxonomy that best matches each node in a source
taxonomy. In particular, we use it to map the nominal part of WordNet
1.5 onto WordNet 1.6, with a very high precision and a very low
remaining ambiguity.

1 Introduction

There is an increasing need for general, accurate and broad-coverage
multilingual lexical/semantic resources for developing NL
applications. Thus, a very active field inside NL in recent years has
been the fast development of generic language resources.

Several attempts have been made to connect already existing
ontologies. In (Ageno et al., 1994), a Spanish/English bilingual
dictionary is used to (semi)automatically link Spanish and English
taxonomies extracted from DGILE (Alvar, 1987) and LDOCE (Procter,
1987). Similarly, a simple automatic approach for linking Spanish
taxonomies extracted from DGILE to WordNet (Miller et al., 1991)
synsets is proposed in (Rigau et al., 1995). The work reported in
(Knight and Luk, 1994) focuses on the construction of Sensus, a large
knowledge base for supporting the Pangloss machine translation
system. In (Okumura and Hovy, 1994), (semi)automatic methods for
associating a Japanese lexicon to an English ontology using a
bilingual dictionary are described. Several experiments aligning EDR
and WordNet ontologies are described in (Utiyama and Hasida, 1997).
Several lexical resources and techniques are combined in (Atserias et
al., 1997) to map Spanish words from a bilingual dictionary to
WordNet, and in (Farreres et al., 1998) the use of the taxonomic
structure derived from a monolingual MRD is proposed as an aid to
this mapping process.

The use of the relaxation labeling algorithm to attach substantial
fragments of the Spanish taxonomy derived from DGILE (Rigau et al.,
1998) to the English WordNet, using a bilingual dictionary for
connecting both hierarchies, has been reported in (Daude et al.,
1999).



In this paper we use the same technique to map WN1.5 to WN1.6. The
aim of the experiment is twofold: first, to show that the method is
general enough to link any pair of ontologies; second, to evaluate
our taxonomy linking procedure by comparing our results with other
existing WN1.5 to WN1.6 mappings.

This paper is organized as follows: In section 2 we describe the
technique used (the relaxation labeling algorithm) and its
application to hierarchy mapping. In section 3 we describe the
constraints used in the relaxation process, and finally, after
presenting some experiments and results, we offer some conclusions
and outline further lines of research.



2 Application of Relaxation Labeling to NLP

Relaxation labeling (RL) is a generic name for a family of iterative
algorithms which perform function optimization based on local
information. See (Torras, 1989) for a summary. Its most remarkable
feature is that it can deal with any kind of constraint. Thus, the
model can be improved by adding any available constraints, and the
algorithm is independent of the complexity of the model. That is, we
can use more sophisticated constraints without changing the
algorithm.

The algorithm has been applied to POS tagging (Marquez and Padro,
1997), shallow parsing (Voutilainen and Padro, 1997) and to word
sense disambiguation (Padro, 1998).

Although other function optimization algorithms could have been used
(e.g. genetic algorithms, simulated annealing, etc.), we found RL to
be suitable for our purposes, given its ability to use models based
on context constraints, and the existence of previous work on
applying it to NLP tasks.

A detailed explanation of the algorithm can be found in (Torras,
1989), while its application to NLP tasks, advantages and drawbacks
are addressed in (Padro, 1998).



2.1 Algorithm Description 

The Relaxation Labeling algorithm deals with a set of variables
(which may represent words, synsets, etc.), each of which may take
one among several different labels (POS tags, senses, MRD entries,
etc.). There is also a set of constraints which state the
compatibility or incompatibility of a combination of variable-label
pairs.

The aim of the algorithm is to find a weight assignment for each
possible label for each variable, such that (a) the weights for the
labels of the same variable add up to one, and (b) the weight
assignment satisfies the set of constraints to the maximum possible
extent.

Summarizing, the algorithm performs constraint satisfaction to solve
a consistent labeling problem. The steps followed are:



1. Start with a random weight assignment. 

2. Compute the support value for each label 
of each variable. Support is computed ac- 
cording to the constraint set and to the 
current weights for labels belonging to 
context variables. 

3. Increase the weights of the labels more 
compatible with the context (larger sup- 
port) and decrease those of the less com- 
patible labels (smaller support). Weights 
are changed proportionally to the sup- 
port received from the context. 

4. If a stopping/convergence criterion is satisfied, stop, otherwise
go to step 2. We use the criterion of stopping when there are no more
changes, although more sophisticated heuristic procedures may also be
used to stop relaxation processes (Eklundh and Rosenfeld, 1978;
Richards et al., 1981).



The cost of the algorithm is proportional to 
the product of the number of variables by the 
number of constraints. 
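
The four steps above can be sketched in Python as follows. This is only a minimal illustration, not the implementation used in the experiments: the function names, the uniform (rather than random) starting weights, the multiplicative update rule and the convergence tolerance are all assumptions, and the support function is assumed to return non-negative values.

```python
from typing import Callable, Dict, Hashable, List

Weights = Dict[Hashable, Dict[Hashable, float]]


def relax(
    labels: Dict[Hashable, List[Hashable]],
    support: Callable[[Hashable, Hashable, Weights], float],
    max_iter: int = 100,
    eps: float = 1e-4,
) -> Weights:
    """Iteratively reweight the labels of each variable (illustrative sketch)."""
    # Step 1: the text starts from random weights; uniform weights are
    # used here (an assumption) so that the sketch is deterministic.
    weights = {v: {l: 1.0 / len(ls) for l in ls} for v, ls in labels.items()}
    for _ in range(max_iter):
        new_weights: Weights = {}
        change = 0.0
        for v, ls in labels.items():
            # Step 2: support of every label given the current weights.
            s = {l: support(v, l, weights) for l in ls}
            # Step 3: scale each weight by (1 + support) and renormalize so
            # the weights of one variable still add up to one. Assumes
            # support >= 0, which keeps the normalizer positive.
            norm = sum(weights[v][l] * (1.0 + s[l]) for l in ls)
            new_weights[v] = {l: weights[v][l] * (1.0 + s[l]) / norm for l in ls}
            change = max(
                change,
                max((abs(new_weights[v][l] - weights[v][l]) for l in ls), default=0.0),
            )
        weights = new_weights
        # Step 4: stop when no weight changes by more than eps.
        if change < eps:
            break
    return weights


# Toy problem: variable "a" has two candidate labels, "b" has one.
labels = {"a": ["x", "y"], "b": ["z"]}


def toy_support(var, label, weights):
    # Hypothetical constraint: the connection a->x is supported by b->z.
    if (var, label) == ("a", "x"):
        return weights["b"]["z"]
    return 0.0


result = relax(labels, toy_support)
```

In the toy problem the weight of label "x" for variable "a" grows at every iteration at the expense of "y", because only "x" receives support from the context.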

2.2 Application to taxonomy 
mapping 

As described in previous sections, the problem we are dealing with is
mapping one taxonomy onto another. In this particular case, we are
interested in mapping WN1.5 to WN1.6, that is, assigning each synset
of the former to at least one synset of the latter.

The modeling of the problem is the following:

• Each WN1.5 synset is a variable for the relaxation algorithm. We
will refer to it as source synset and to WN1.5 as source taxonomy.

• The possible labels for that variable are all the WN1.6 synsets
which contain a word belonging to the source synset. We will refer to
them as target synsets and to WN1.6 as target taxonomy.

• The algorithm will need constraints stating whether a WN1.6 synset
is a suitable assignment for a WN1.5 synset. As described in section
3, these constraints will rely on the taxonomy structure.
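
The second item of the model above (candidate labels are the target synsets sharing a word with the source synset) can be illustrated with a short Python sketch. The synset identifiers and word sets below are invented toy data, and the function name is hypothetical; real WordNet files would need their own reader.

```python
from collections import defaultdict
from typing import Dict, List, Set


def candidate_labels(
    source: Dict[str, Set[str]], target: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """For each source synset, list the target synsets sharing a word."""
    # Inverted index: word -> target synsets that contain it.
    index: Dict[str, Set[str]] = defaultdict(set)
    for target_id, words in target.items():
        for word in words:
            index[word].add(target_id)
    candidates: Dict[str, List[str]] = {}
    for source_id, words in source.items():
        found: Set[str] = set()
        for word in words:
            found |= index.get(word, set())
        candidates[source_id] = sorted(found)
    return candidates


# Toy data (hypothetical identifiers, not real WordNet offsets).
wn15 = {"s1": {"bank", "depository"}, "s2": {"bank"}, "s3": {"zyx"}}
wn16 = {"t1": {"bank", "depository"}, "t2": {"bank", "riverbank"}}

cands = candidate_labels(wn15, wn16)
```

Here "s1" and "s2" both get the candidates "t1" and "t2" and are therefore ambiguous, while "s3" has no candidate connection at all.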

3 The Constraints 

Constraints are used by the relaxation la- 
beling algorithm to increase or decrease the 
weight for a variable label. In our case, con- 
straints increase the weights for the connec- 
tions between a source synset and a target 
synset. Increasing the weight for a connec- 
tion implies decreasing the weights for all the 
other possible connections for the same source 
synset. To increase the weight for a connec- 
tion, constraints look for already connected 
nodes that have the same relationships in 
both taxonomies. 

Although there is a wide range of relation- 
ships between WordNet synsets which can 
be used to build constraints, we have fo- 
cused on the hyper/hyponym relationships. 
That is, we increase the weight for a con- 
nection when the involved nodes have hyper- 
nyms/hyponyms also connected. We consider 
hyper/hyponym relationships either directly 
or indirectly (i.e. ancestors or descendants), 
depending on the kind of constraint used. 

Figure 1 shows an example of possible connections between two
taxonomies. Connection C4 will have its weight increased due to C5,
C6 and C1, while connections C2 and C3 will have their weights
decreased.

Figure 1: Example of connections between taxonomies.

We distinguish different kinds of constraints, depending on whether
we consider hyponyms, hypernyms or both, on whether we consider those
relationships direct or indirect, and on which of the two taxonomies
we do so in. Each constraint may be used alone or combined with
others.

Below we describe all the kinds of constraint used. They are labeled
with a three-character code (XYZ), which must be read as follows: The
first character (X) indicates how the hyper/hyponym relationship is
considered in the source taxonomy: only for immediate nodes (I) or
for any (A) ancestor/descendant. The second character (Y) codes the
same information for the target taxonomy side. The third character
(Z) indicates whether the constraint requires the existence of a
connected hypernym (E), hyponym (O), or both (B).

IIE constraint: The simplest constraint is to check whether the
connected nodes have respective direct hypernyms also connected. IIE
stands for immediate source (I), immediate target (I), hypernym (E).

This constraint will increase the weights for those connections in
which the immediate hypernym of the source node is connected with the
immediate hypernym of the target node.

IIO constraint: This constraint increases the weight for those
connections in which an immediate hyponym of the source node is
connected to an immediate hyponym of the target node.

IIB constraint: This constraint increases the 
weight for the connections in which the 
immediate hypernym of the source node 
is connected to the immediate hypernym 
of the target node and an immediate hy- 
ponym of the source is connected to an 
immediate hyponym of the target. 

II constraints: If we use constraints IIE, IIO and IIB at the same
time, weights will be modified for connections matching any of the
constraints. That is, we are additively combining constraints. In the
case where two of them apply, their effects will be added. If they
have opposite effects, they will cancel each other. Figure 2 shows a
graphical representation of all II constraints.

Figure 2: II constraints.

The arrows indicate an immediate hypernymy relationship. The nodes on
the left hand side correspond to the source taxonomy and the nodes on
the right to the target hierarchy. The dotted line is the connection
whose weight will be increased due to the existence of the connection
indicated with a continuous line.

AIE constraint: This constraint increases the weight for the
connections in which an ancestor of the source node is connected to
the immediate hypernym of the target node.

AIO constraint: This constraint increases the weight for the
connections in which a descendant of the source node is connected to
an immediate hyponym of the target node.

AIB constraint: This constraint increases the weight for the
connections in which an ancestor of the source node is connected to
the immediate hypernym of the target node and a descendant of the
source node is connected to an immediate hyponym of the target node.

AI constraints: If we use constraints AIE, AIO and AIB
simultaneously, we apply either a hypernym constraint, a hyponym
constraint, or both of them. In the last case, the joint constraint
is also applied. This means that connections with matching hypernym
and hyponym will have their weights doubly increased. Figure 3 shows
a graphical representation of all AI constraints.

Figure 3: AI constraints.

In this figure, the + sign indicates that the hypernymy relationship
represented by the arrow does not need to be immediate. In this case,
this iteration is only allowed in the source taxonomy.

IA constraints: These are symmetrical to AI constraints. In this
case, recursion is allowed only on the target taxonomy.

Figure 4 shows a graphical representation of all IA constraints.

Figure 4: IA constraints.



AA constraints: Include the above combi- 
nations, but allowing recursion on both 
sides. 

Figure 5 shows a graphical representation of all AA constraints.
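
To make the three-character codes concrete, the sketch below computes a support value for a candidate connection under any of the constraints described in this section. It is a hedged illustration only: the taxonomy representation, the function names, and in particular the scoring (summing the weights of matching connections, and multiplying the hypernym and hyponym scores for B constraints) are assumptions, since the exact compatibility values are not given here.

```python
from typing import Dict, List, Set

Taxonomy = Dict[str, List[str]]        # node -> its immediate hypernyms
Weights = Dict[str, Dict[str, float]]  # source node -> {target node: weight}


def up(tax: Taxonomy, node: str, immediate: bool) -> Set[str]:
    """Immediate hypernyms of node, or all its ancestors if not immediate."""
    seen: Set[str] = set()
    frontier = list(tax.get(node, []))
    while frontier:
        n = frontier.pop()
        if n not in seen:
            seen.add(n)
            if not immediate:
                frontier.extend(tax.get(n, []))
    return seen


def invert(tax: Taxonomy) -> Taxonomy:
    """Turn a node -> hypernyms map into a node -> hyponyms map."""
    down: Taxonomy = {}
    for child, parents in tax.items():
        for parent in parents:
            down.setdefault(parent, []).append(child)
    return down


def linked(src_nodes: Set[str], tgt_nodes: Set[str], weights: Weights) -> float:
    """Total weight of the connections between the two node sets."""
    return sum(weights.get(s, {}).get(t, 0.0) for s in src_nodes for t in tgt_nodes)


def support(code: str, src: str, tgt: str,
            src_tax: Taxonomy, tgt_tax: Taxonomy, weights: Weights) -> float:
    """Support for connection src->tgt under a constraint such as 'IIE'."""
    src_imm, tgt_imm, kind = code[0] == "I", code[1] == "I", code[2]
    hyper = linked(up(src_tax, src, src_imm), up(tgt_tax, tgt, tgt_imm), weights)
    hypo = linked(up(invert(src_tax), src, src_imm),
                  up(invert(tgt_tax), tgt, tgt_imm), weights)
    if kind == "E":
        return hyper
    if kind == "O":
        return hypo
    if kind == "B":
        return hyper * hypo  # assumed scoring: both matches are required
    raise ValueError(f"unknown constraint code: {code}")


# Toy taxonomies: the target has one extra level between canine and animal.
src_tax = {"s_dog": ["s_canine"], "s_canine": ["s_animal"]}
tgt_tax = {"t_dog": ["t_canine"], "t_canine": ["t_carnivore"],
           "t_carnivore": ["t_animal"]}
weights = {"s_animal": {"t_animal": 1.0}, "s_dog": {"t_dog": 1.0}}

iie = support("IIE", "s_canine", "t_canine", src_tax, tgt_tax, weights)
iae = support("IAE", "s_canine", "t_canine", src_tax, tgt_tax, weights)
```

In the toy data the immediate hypernyms of the two "canine" nodes are not connected, so the IIE constraint gives no support, whereas the IAE constraint finds the connected ancestor pair. This is the kind of structural mismatch that the A variants are meant to tolerate.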

4 Experiments and Results 

In the tests performed we used simultaneously all constraints with
the same recursion pattern. This yields the packs: II, AI, IA and AA.
Results are reported only for the latter, since it is the most
informed constraint set.

Figure 5: AA constraints.

We also compared our mapping with the SenseMap provided by
Princeton¹, and the coincidence was quite high, especially in the
cases in which SenseMap has a high confidence score. Details can be
found in section 4.1.



In order to perform the comparison, we had to convert SenseMap, which
is a sense mapping (that is, it maps each variant in WN1.5 to a
variant in WN1.6), into a synset mapping, which is what our algorithm
does. Since synsets are coarser than senses, the conversion is
straightforward. When two senses in the same 1.5 synset were assigned
two senses in different 1.6 synsets, we took both targets as valid,
slightly increasing the remaining ambiguity of SenseMap.
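
The conversion just described can be sketched as follows. The input format (a list of sense pairs plus two sense-to-synset lookup tables) and all identifiers are assumptions made for illustration; they do not reflect the actual SenseMap file layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def senses_to_synsets(
    sense_map: Iterable[Tuple[str, str]],
    synset_of_15: Dict[str, str],
    synset_of_16: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Collapse a sense-to-sense mapping into a synset-to-synsets mapping."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for sense_15, sense_16 in sense_map:
        # Every target synset reached from any sense of the source synset
        # is kept as valid, which may leave more than one target.
        result[synset_of_15[sense_15]].add(synset_of_16[sense_16])
    return dict(result)


# Toy data: two senses of source synset "s1" map to different target synsets.
sense_map = [("bank%1", "bank%1"), ("depository%1", "depository%2")]
synset_of_15 = {"bank%1": "s1", "depository%1": "s1"}
synset_of_16 = {"bank%1": "t1", "depository%2": "t2"}

synset_map = senses_to_synsets(sense_map, synset_of_15, synset_of_16)
```

In this toy case the source synset "s1" ends up with two valid targets, which is exactly the situation that slightly raises the remaining ambiguity of the converted mapping.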

The results are computed over the synsets with at least one candidate
connection, which represent 99.1% of WN1.5. We consider ambiguous
synsets those with more than one candidate connection.

Table 1 presents the percentage of nodes for which disambiguation is
performed and some candidate connections are discarded (i.e. the
mapping does not keep all the candidates as possible).



            ambiguous   overall
SenseMap    98.0%       99.2%
RL          99.8%       99.9%

Table 1: Coverage of WN1.5 for both mappings.

Table 2 presents an estimation of how many of those assignments were
right, as well as the precision for SenseMap, computed under the same
conditions. Those figures were computed by manually linking to WN1.6
a sample of 1900 synsets randomly chosen from WN1.5, and then using
this sample mapping as a reference to evaluate all mappings. These
figures show that our system performs a better mapping than SenseMap.
The difference between both mappings is significant at a 95%
confidence level.





                ambiguous       overall
SenseMap        93.3%-96.9%     96.9%-98.6%
RL (δ = 0.3)    96.5%-97.7%     98.4%-98.9%
RL (δ = 0.4)    97.0%-97.6%     98.6%-98.9%
RL (δ = 0.5)    97.2%-97.6%     98.7%-98.9%

Table 2: Precision-recall results for both WN1.5-WN1.6 mappings.

¹ See WN web page at http://www.cogsci.princeton.edu/~wn/

Since relaxation labeling performs a weight assignment for each
possible connection, we can control the remaining ambiguity (and thus
the recall/precision tradeoff) by changing the threshold (δ) that the
weight for a connection has to reach to be considered a solution.
Although higher thresholds maintain recall and produce a higher
precision, the differences are not statistically significant.
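
A minimal sketch of this thresholding step is given below, together with the average number of proposals per synset as a measure of remaining ambiguity. The function names and the toy weights are hypothetical, and treating "reach" as greater-than-or-equal is an assumption.

```python
from typing import Dict, List

Weights = Dict[str, Dict[str, float]]


def select(weights: Weights, threshold: float) -> Dict[str, List[str]]:
    """Keep, for each source synset, the connections reaching the threshold."""
    return {
        src: [tgt for tgt, w in targets.items() if w >= threshold]
        for src, targets in weights.items()
    }


def average_ambiguity(solution: Dict[str, List[str]]) -> float:
    """Mean number of proposed targets over synsets with at least one."""
    counts = [len(targets) for targets in solution.values() if targets]
    return sum(counts) / len(counts) if counts else 0.0


# Toy final weights for two source synsets.
final_weights = {"a": {"x": 0.55, "y": 0.45}, "b": {"z": 1.0}}

loose = select(final_weights, 0.4)   # both candidates of "a" survive
strict = select(final_weights, 0.5)  # only the best candidate of "a" survives
```

With the lower threshold the synset "a" keeps two proposals, and with the higher one it keeps a single proposal, so raising the threshold reduces the remaining ambiguity.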

4.1 Coincidence of Both Mappings 

For each confidence group in the Princeton mapping, the soft
agreement column in Table 3 indicates the percentage of WN1.5 synsets
for which our system proposes at least one connection also proposed
by the Princeton mapping. The hard agreement column indicates the
percentage of connections proposed by our system that are also
proposed by the Princeton mapping.
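
The two agreement measures can be written down directly from these definitions. The sketch below is illustrative only: the function name, the dictionary-of-sets representation and the toy mappings are assumptions.

```python
from typing import Dict, Set, Tuple


def agreement(
    ours: Dict[str, Set[str]], reference: Dict[str, Set[str]]
) -> Tuple[float, float]:
    """Return (hard, soft) agreement percentages of ours against reference."""
    synsets = [s for s in ours if ours[s]]
    # Soft: synsets with at least one proposed connection in the reference.
    soft_hits = sum(1 for s in synsets if ours[s] & reference.get(s, set()))
    # Hard: proposed connections that also appear in the reference.
    total = sum(len(ours[s]) for s in synsets)
    hard_hits = sum(len(ours[s] & reference.get(s, set())) for s in synsets)
    soft = 100.0 * soft_hits / len(synsets) if synsets else 0.0
    hard = 100.0 * hard_hits / total if total else 0.0
    return hard, soft


# Toy mappings: one of our three connections is shared with the reference.
ours = {"a": {"x", "y"}, "b": {"z"}}
reference = {"a": {"x"}, "b": {"w"}}

hard, soft = agreement(ours, reference)
```

Soft agreement can never be lower than hard agreement when each synset has a single proposal, and the two diverge only for synsets that keep several candidates.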

The agreement between both systems is quite high, especially for the
groups with a high confidence level. This is quite reasonable, since
a perfect system would be expected to agree with the assignments in
the 20% confidence group of SenseMap only about 20% of the time. It
must also be taken into account that for low confidence groups,
SenseMap is much more ambiguous.



                             Agreement
confidence       δ = 0.3          δ = 0.4          δ = 0.5
group          hard    soft     hard    soft     hard    soft
monosemous    96.9%   97.3%    97.0%   97.3%    97.1%   97.2%
100%          88.6%   90.4%    89.1%   90.1%    89.5%   89.8%
90%           87.9%   89.8%    88.4%   89.3%    88.7%   89.1%
80%           69.3%   70.2%    70.1%   70.5%    70.4%   70.4%
70%           76.5%   78.0%    76.4%   77.6%    76.5%   76.8%
60%           53.8%   53.8%    53.8%   53.8%    53.8%   53.8%
50%           68.4%   81.2%    72.7%   77.4%    72.7%   77.4%
40%           50.7%   50.8%    50.8%   50.8%    50.8%   50.8%
30%           65.3%   65.3%    65.3%   65.3%    65.3%   65.3%
20%           32.6%   32.6%    32.6%   32.6%    32.6%   32.6%
subtotal      87.3%   89.1%    87.8%   88.8%    88.3%   88.6%
Total         93.6%   94.5%    93.8%   94.4%    94.1%   94.2%

Table 3: Agreement between both mappings.



The average remaining ambiguity in the Princeton mapping and in the
mapping performed by the relaxation labeling algorithm is shown
respectively in columns SenseMap ambiguity and RL ambiguity of Table
4.

Our system proposes, in most cases, a unique WN1.6 synset for each
WN1.5 synset. The average ranges from 1.001 to 1.007 proposals per
synset depending on the chosen δ threshold, while the Princeton
mapping has an average of 1.007.

Summarizing, the obtained results indicate that our system is able to
produce a less ambiguous assignment than SenseMap, with a
significantly higher accuracy and wider coverage.

In addition, our system only uses structural 
information (namely, hyper/hyponymy rela- 
tionships) while SenseMap uses synset words, 
glosses, and other information in WordNet. 
On the one hand, when information other 
than taxonomy structure is used results might 
be even better. On the other hand, for cases 
in which such information is not available 
(e.g. further development of EuroWordNets 
in new languages), structure may provide a 
reliable basis. 



Confidence              SenseMap           RL ambiguity
group          Size     ambiguity    δ = 0.3   δ = 0.4   δ = 0.5
monosemous    45807       1.003       1.001     1.001     1.001
100%          20075       1.000       1.020     1.011     1.003
90%            2977       1.007       1.022     1.010     1.005
80%             326       1.080       1.018     1.009     1.000
70%             249       1.024       1.020     1.012     1.004
60%              93       1.043       1.000     1.000     1.000
50%              32       1.063       1.125     1.064     1.064
40%              67       1.448       1.031     1.015     1.000
30%              65       1.569       1.000     1.000     1.000
20%             209       2.215       1.031     1.025     1.020
subtotal      24093       1.016       1.020     1.011     1.003
Total         69900       1.007       1.007     1.006     1.001

Table 4: Average remaining ambiguity of both mappings.

5 Conclusions & Further Work

We have applied the relaxation labeling algorithm to assign an
appropriate node in a target taxonomy to each node in a source
taxonomy, using only hyper/hyponymy information.

Results on WN1.5 to WN1.6 mapping have been reported. The high
precision achieved provides further evidence that this technique,
previously used in (Daude et al., 1999) to link a Spanish taxonomy to
WN1.5, constitutes an accurate method to connect taxonomies, either
for the same or different languages. Further extensions of this
technique to include information other than structural may result in
a valuable tool for those concerned with the development and
improvement of large lexical or semantic ontologies.

The results obtained up to now seem to indicate that:

• The relaxation labeling algorithm is a good technique to link two
different hierarchies. For each node with several possible
connections, the candidate that best matches the surrounding
structure is selected.

• The structural information provides enough knowledge to accurately
link taxonomies. Experiments on mapping taxonomies automatically
extracted from a Spanish MRD to WN1.5 (Daude et al., 1999) show that
the technique may be useful even when both taxonomies belong to
different languages or have structures less similar than in the case
reported in this paper.

• The system produces a good assignment for WN mapping, based only on
hyper/hyponymy relationships, which is especially useful when no
other information is available (e.g. in the case of mapping the
EuroWN hierarchies). The remaining ambiguity is low with a high
accuracy, and the precision-recall tradeoff may be controlled by
adjusting the δ threshold.

Some issues to address in order to improve the algorithm performance
and to exploit its possibilities are:

• Use relationships other than hyper/hyponymy as constraints to
select the best connection. Relationships such as sibling, cousin,
etc. could be used. In addition, WN provides other relationships such
as synonymy, meronymy, etc. which could also provide useful
constraints.

• Use other available information, such as synset words, glosses,
etc. in the WN to WN mapping task.

• Link the verbal, adjectival, and adverbial parts of WN1.5 and
WN1.6.

• Test the performance of the technique to link other structures
(e.g. WN-EDR, WN-LDOCE, Dutch-WN, Italian-WN, ...).

• Use it to link taxonomies for new languages to EuroWordNet.

• Go a step beyond the source-to-target vision, and map the
taxonomies symmetrically, that is, each node of each taxonomy is
assigned to a node in the other taxonomy. This should increase the
coverage, and reinforce or discard connections that may be weak when
assigned only in one direction. This could even open the door to
many-to-many taxonomy mapping.